nd is removed from the candidate outlier list,
ܢିൌܢି\ݖ
ି
(6.30)
candidate outlier (ݖ
ି) is also inserted into the tight cluster set,
ܢାൌܢା∪ݖ
ି
(6.31)
ݐఛ, ݖ
ି is predicted as an outlier (perhaps a high extreme outlier)
xamination of ݖ
ିݖ
ି, ∀ݖ
ି∈ܢି is terminated. If ݐ൏െݐఛ, ݖ
ି
ted as an outlier (perhaps a low extreme outlier) and the
ion of ݖ
ି൏ݖ
ି, ∀ݖ
ି∈ܢି is terminated.
cover DEGs when outlier genes are present — simulated data
ated toy data set was generated for the demonstration of
ng DEGs when there are outliers present. In this data set, 400
Gs and 50 up-regulated DEGs as well as 50 down-regulated DEGs
igned. The replicate number was 20. The number of outliers
om one to five. Outliers were inserted into 10% of non-DEGs.
ns that 40 non-DEGs had outliers. Outliers were also inserted into
DEGs. Therefore, ten down-regulated DEGs and ten up-regulated
ad outliers. Figure 6.18 shows three types of genes in this
d toy data. Figure 6.18(a) shows a non-DEG in which one control
and one case replicate were the outliers. In Figure 6.18(b), two
eplicates were the outliers for an up-regulated DEG. In Figure
wo case replicates were the outliers for a down-regulated DEG.
outliers appear, the traditional t test may not work well. Even the
t test may not work well. Figure 6.19 shows an experiment of
presence of outliers in the data has led to potential misleading
ng the t test. The error statistics were also included in the figure.
o to ten down-regulated DEGs were mis-classified. From six to
n-regulated DEGs were misclassified. Note that the maximum
of down-regulated DEGs was ten and the maximum number of
ated DEGs was ten as well. There was no prediction error for the
Gs when using the t test. Therefore, no Type I error. This shows